Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format changes #773

Merged
merged 7 commits into from
Jul 31, 2024
Merged

Format changes #773

merged 7 commits into from
Jul 31, 2024

Conversation

wx20jjung
Copy link
Contributor

@wx20jjung wx20jjung commented Jul 11, 2024

Description
The next generation polar satellites from EUMETSAT, metop-sg-a?, have more characters (11) in their name than the GSI is currently coded to use (10). Also, the new IASI instrument, IASI-NG, has more than 10,000 channels. The current GSI uses I4 in formats for write statements, These changes increase this to I5.

Without these changes issues will arise with various files like the satinfo file when the new satellites are added. There are other internal issues like generating the radstat files and loading the correct coefficient files for the CRTM, where the name would be clipped and potentially conflict with follow-on satellites e.g. iasi-ng_metop-sg-a1 would conflict with iasi-ng_metop-sg-a2 because the 1 and 2 in the name would be clipped.

I had concerns about how to best code the character() changes. See discussion #769
I followed Innocent Souopgui's comments in my coding.

The changes in this PR are format changes only in preparation for metop-sg-a? This fixes #770.
This PR also fixes #775

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

I have tested these changes on S4 with proxy data for IASI-NG_metop-sg-a1 in a short, 2 cycle experiment. I added IASI-NG channels to the satinfo file and checked the various outputs (bias correction files, gsistat file output radstat file names, and etc. Current satellites work properly and all of the new satellite and channel number format problems I could find are fixed. These changes should not change results with the current satellites and have not in my current tests.

My ctests on hera passed. Both rrfs ctests on jet (develop and update) failed. All other tests on jet passed. The GSI is not able to fully test these changes until supporting read subroutines for metop-sg-a? are included.

My jet statistics:
1/6 Test #3: rrfs_3denvar_rdasens .............***Failed 1980.84 sec
2/6 Test #2: rtma ............................. Passed 3552.39 sec
3/6 Test #5: hafs_3denvar_hybens .............. Passed 3558.76 sec
4/6 Test #4: hafs_4denvar_glbens .............. Passed 3737.34 sec
5/6 Test #6: global_enkf ...................... Passed 3935.60 sec
6/6 Test #1: global_4denvar ................... Passed 3970.37 sec

My hera statistics:
1/6 Test #3: rrfs_3denvar_rdasens ............. Passed 1637.22 sec
2/6 Test #6: global_enkf ...................... Passed 2087.80 sec
3/6 Test #1: global_4denvar ................... Passed 2929.82 sec
4/6 Test #5: hafs_3denvar_hybens .............. Passed 18541.04 sec
5/6 Test #4: hafs_4denvar_glbens .............. Passed 18605.18 sec
6/6 Test #2: rtma ............................. Passed 18648.58 sec

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

@wx20jjung
Copy link
Contributor Author

@RussTreadon-NOAA I am not able to assign reviewers. I suggest @emilyhcliu as she will be working on the metop-sg-a1 microwave sounder. If anyone else plans to work on a metop-sg-a1 instrument, they may want to participate. Other potential reviewers would be Andrew Collard, Dave Huber or Innocent Souopgui.

FYI the branch name is "format_changes"

@RussTreadon-NOAA
Copy link
Contributor

@wx20jjung : @emilyhcliu has been added as a reviewer. Please reach out to the potential reviewers you listed to see who else can review your changes. GSI PRs need to peer reviews and approvals.

@wx20jjung
Copy link
Contributor Author

@ADCollard, @DavidHuber-NOAA @InnocentSouopgui-NOAA

Would any of you be willing to review my "format_changes" branch? This is my changes for adding metop-sg-a1 instruments.

@ADCollard
Copy link
Contributor

@ADCollard, @DavidHuber-NOAA @InnocentSouopgui-NOAA

Would any of you be willing to review my "format_changes" branch? This is my changes for adding metop-sg-a1 instruments.

Happy to

@RussTreadon-NOAA
Copy link
Contributor

Thank you @ADCollard

@ADCollard
Copy link
Contributor

@wx20jjung Is there any reason why we are only increasing the length by one character. Maybe that is enough, but is it worth making it a little larger to reduce the risk of doing this exercise again?

@InnocentSouopgui-NOAA
Copy link
Contributor

@wx20jjung Is there any reason why we are only increasing the length by one character. Maybe that is enough, but is it worth making it a little larger to reduce the risk of doing this exercise again?

I second this.

@InnocentSouopgui-NOAA
Copy link
Contributor

@ADCollard, @DavidHuber-NOAA @InnocentSouopgui-NOAA

Would any of you be willing to review my "format_changes" branch? This is my changes for adding metop-sg-a1 instruments.

I will review.

@InnocentSouopgui-NOAA
Copy link
Contributor

This is not a big concern, but a flag that can slow down the review:
in read_obs, I see that the temporary variable platid is removed and dplat(i) is used directly instead. Since platid is used a lot of times, it shows up as an invasive change.
It is best to leave this type of change for cleanup PR.

@wx20jjung
Copy link
Contributor Author

I have 2 reasons for only increasing the length by 1. The GSI sunset date is approaching and metop-sg-a? could be the last satellite added, especially with such a long name. The other reason, the current length of 10 has been adequate for decades. I can make it any length as long as there is a consensus.

My goal was to make changing the length of all the variables involved simple. In this case the original variable is dplat(). dplat() is a public (common) variable and can be added with a use statement. I originally tried to make the length of platid dependent on the length of dplat(). My tests did not work. The alternative was to change platid to dplat.

Copy link
Collaborator

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, with one suggestion.

As for the lengths, I'm OK with 11, but wouldn't want an extreme number (>20) so the printouts to the stat files aren't difficult to read.

src/gsi/statsrad.f90 Show resolved Hide resolved
@wx20jjung
Copy link
Contributor Author

@DavidHuber-NOAA

In the format change:
-1115 format('o-g',1x,i2.2,1x,'rad',2x,2A10,2x,3(i11,2x),4(g12.5,1x))
+1115 format('o-g',1x,i2.2,1x,'rad',2x,2A12,2x,3(i11,2x),4(g12.5,1x))

The 2A12 does leave at least one space between them.

@DavidHuber-NOAA
Copy link
Collaborator

@wx20jjung Agreed. Just a suggestion since dytype is only length 10. I'm fine with it either way.

Copy link
Contributor

@InnocentSouopgui-NOAA InnocentSouopgui-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main comment:

in src/gsi/read_diag.f90
was the changes here tested with binary files? I didn't see where those files are created to check if the creation is updated accordingly.
for instance subroutine read_radiag_header_bin reads record including satid that the size just changed from 10 to 11.

in src/gsi/statsrad.f90
Good catch (A16 => A20) for the 3rd element in the format # 1102, 16 was smaller than the size of the variable
Great update (2a10 to 2A12) in format # 1115, ensuring a space for readability

@wx20jjung
Copy link
Contributor Author

@InnocentSouopgui-NOAA I am not sure how to test the _bin (binary) option. The defaults are netcdf. Any suggestions?

In setuprad.f90 the header is written by subroutine init_binary_diag_
The open statement is:
diag_rad_file= trim(dirname) // trim(filex) // '_' // trim(dplat(is)) // trim(string)
if(init_pass) then
open(4,file=trim(diag_rad_file),form='unformatted',status='unknown',position='rewind')
else
open(4,file=trim(diag_rad_file),form='unformatted',status='old',position='append')
endif

A little further down the write statement is:
write(4) isis,dplat(is),obstype,jiter,nchanl_diag,npred,ianldate,ireal_radiag,ipchan_radiag,iextra,jextra,&
idiag,angord,iversion_radiag,inewpc,ioff0,ijacob
Here dplat is from obsmod

To be consistent, satid should be the same length as dplat().

@InnocentSouopgui-NOAA
Copy link
Contributor

Thanks @wx20jjung, That write in setuprad solves my concern.

Copy link
Contributor

@ADCollard ADCollard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this too

@RussTreadon-NOAA
Copy link
Contributor

WCOSS2 (Dogwood) ctests
Install wx20jjung:format_changes on Dogwood. Run ctests with following results.

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr773/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  730.15 sec
2/6 Test #6: global_enkf ......................   Passed  860.31 sec
3/6 Test #2: rtma .............................   Passed  974.11 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1155.31 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1274.71 sec
6/6 Test #1: global_4denvar ...................   Passed  1743.10 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1743.13 sec

This is an expected result ... still nice to get confirmation of no failures.

Copy link
Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes peer reviewed and approved. Ctests run with acceptable results.

Approve.

@RussTreadon-NOAA
Copy link
Contributor

Hercules ctests
Install format_changes at 0db28da on Hercules. Run ctests with the following results

Test project /work/noaa/da/rtreadon/git/gsi/pr773/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  785.75 sec
2/6 Test #6: global_enkf ......................   Passed  1086.83 sec
3/6 Test #2: rtma .............................   Passed  1386.58 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1452.31 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1639.43 sec
6/6 Test #1: global_4denvar ...................   Passed  3241.30 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3241.31 sec

Did not run ctests on Orion due to (1) problem with rrfs_3denvar_rdasens (issue #766) and (2) increased Orion run times following Rocky 9 upgrade (issue #771)

@RussTreadon-NOAA
Copy link
Contributor

@wx20jjung , do we need to make any changes in src/enkf to account for the longer satellite name?

I see the following
on line 1096 src/enkf/readsatobs.f90

  character(len=10):: satid,sentype

There are other len=10 variables in readsatobs.f90 ... and possibly other enkf source code files ... which need adjustment.

@wx20jjung
Copy link
Contributor Author

@RussTreadon-NOAA Your examples are of concern. I did not look through src/enkf and should have. I will review the src/enfk files next week and make the necessary changes.

@RussTreadon-NOAA
Copy link
Contributor

Thank you @wx20jjung for checking code in src/enkf to see which files need to be updated. Getting to this next week is fine.

@wx20jjung
Copy link
Contributor Author

wx20jjung commented Jul 23, 2024 via email

@RussTreadon-NOAA
Copy link
Contributor

Thank you @wx20jjung for the update. Please commit the necessary changes to wx20jjung:format_changes once your testing is complete. Feel free to also fold changes for issue #775 into this PR.

… IASI (AVHRR) to avoid zero values for cluster size and radiance values. I also removed an unused variable.
@RussTreadon-NOAA
Copy link
Contributor

Thank you @wx20jjung for updating format_changes with changes from issue #775. I'll hold off on Cactus testing until enkf changes are tested and committed to format_changes.

@RussTreadon-NOAA
Copy link
Contributor

@wx20jjung , please commit changes to src/enkf to wx20jjung:format_changes. Once this is done, we can do one more round of ctests. We can then schedule this PR for merger into develop pending successful ctests results.

@wx20jjung
Copy link
Contributor Author

wx20jjung commented Jul 29, 2024 via email

@wx20jjung
Copy link
Contributor Author

wx20jjung commented Jul 29, 2024 via email

@RussTreadon-NOAA
Copy link
Contributor

Thank you @wx20jjung.

@RussTreadon-NOAA
Copy link
Contributor

WCOSS2 test
Install wx20jjung:format_changes at 4ba610b on Cactus. Build app in Release mode and run ctests with following results

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr773/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  726.97 sec
2/6 Test #6: global_enkf ......................   Passed  850.58 sec
3/6 Test #2: rtma .............................   Passed  968.53 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1153.91 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1212.61 sec
6/6 Test #1: global_4denvar ...................   Passed  1682.41 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1682.42 sec

@RussTreadon-NOAA
Copy link
Contributor

RussTreadon-NOAA commented Jul 30, 2024

WCOSS2 debug test
Install wx20jjung:format_changes at 4ba610b on Cactus. Build app in Debug mode. Run loproc_updat configuration for each ctest with the following results

global_4denvar
Debug gsi.x runs to completion

     ENDING DATE-TIME    JUL 30,2024  12:26:12.627  212  TUE   2460522
     PROGRAM GSI_ANL HAS ENDED.
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
*****************RESOURCE STATISTICS*******************************
The total amount of wall time                        = 8803.247047
The total amount of time in user mode                = 8776.590204
The total amount of time in sys mode                 = 9.827523
The maximum resident set size (KB)                   = 1310124
Number of page faults without I/O activity           = 430497
Number of page faults with I/O activity              = 32
Number of times filesystem performed INPUT           = 3254130
Number of times filesystem performed OUTPUT          = 1285024
Number of Voluntary Context Switches                 = 12057
Number of InVoluntary Context Switches               = 2467
*****************END OF RESOURCE STATISTICS*************************

global_enkf
Debug enkf.x aborts with

forrtl: error (65): floating invalid
Image              PC                Routine            Line        Source
enkf.x             0000000004EEBB4B  Unknown               Unknown  Unknown
libpthread-2.31.s  000015074A6528C0  Unknown               Unknown  Unknown
enkf.x             000000000053E985  letkf_mp_letkf_up         279  letkf.f90
libiomp5.so        000015074BA9B3F3  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        000015074BA1F937  __kmp_fork_call       Unknown  Unknown
libiomp5.so        000015074B9E3533  __kmpc_fork_call      Unknown  Unknown
enkf.x             00000000005383F1  letkf_mp_letkf_up         279  letkf.f90
enkf.x             00000000004147CE  MAIN__                    210  enkf_main.f90

This failure is reported in issue #776. The same failure occurs when running the debug develop enkf.x. This failure is not due to changes in this PR. Resolution of this failure will be tracked via issue #776

hafs_3denvar_hybens and hafs_4denvar_glbens
Debug gsi.x aborts with

forrtl: error (182): floating invalid - possible uninitialized real/complex variable.
Image              PC                Routine            Line        Source
gsi.x              0000000007F062EB  Unknown               Unknown  Unknown
libpthread-2.31.s  000015425BC6A8C0  Unknown               Unknown  Unknown
libmpi_intel.so.1  0000154257C3FF75  Unknown               Unknown  Unknown
libmpi_intel.so.1  0000154256491409  Unknown               Unknown  Unknown
libmpi_intel.so.1  0000154257BFEEC8  Unknown               Unknown  Unknown
libmpi_intel.so.1  0000154256381F2A  Unknown               Unknown  Unknown
libmpi_intel.so.1  00001542563DEF20  Unknown               Unknown  Unknown
libmpi_intel.so.1  00001542563E0B98  PMPI_Reduce           Unknown  Unknown
libmpifort_intel.  000015425C33E7CE  MPI_REDUCE            Unknown  Unknown
gsi.x              000000000379F72C  combine_radobs_           137  combine_radobs.f90
gsi.x              000000000522C67E  read_iasi_                984  read_iasi.f90
gsi.x              0000000001C4811C  read_obsmod_mp_re        1757  read_obs.F90
gsi.x              00000000017634C9  observermod_mp_se         329  observer.F90
gsi.x              0000000003F4370A  glbsoi_                   222  glbsoi.f90
gsi.x              000000000108EA6C  gsisub_                   200  gsisub.F90
gsi.x              000000000042CBB5  gsimod_mp_gsimain        2431  gsimod.F90
gsi.x              0000000000413B3B  MAIN__                    633  gsimain.f90

This failure is not due to changes in this PR. The debug develop gsi.x fails with the same traceback. @dkokron found that initialzing data_all in read_iasi.f90 resolves this failure.

rtma
Debug gsi.x aborts with

forrtl: error (65): floating invalid
Image              PC                Routine            Line        Source
gsi.x              0000000007F062EB  Unknown               Unknown  Unknown
libpthread-2.31.s  00001553415EC8C0  Unknown               Unknown  Unknown
gsi.x              000000000361B280  anisofilter_mp_ge        5418  anisofilter.f90
gsi.x              000000000340492A  anisofilter_mp_an         456  anisofilter.f90
gsi.x              0000000003F43A0A  glbsoi_                   254  glbsoi.f90
gsi.x              000000000108EA6C  gsisub_                   200  gsisub.F90
gsi.x              000000000042CBB5  gsimod_mp_gsimain        2431  gsimod.F90
gsi.x              0000000000413B3B  MAIN__                    633  gsimain.f90

The same error occurs when running the debug develop gsi.x. This failure is not due to changes in the PR.

rrfs_3denvar_rdasens_
Debug gsi.x successfully runs to completion.

     ENDING DATE-TIME    JUL 30,2024  10:12:19.301  212  TUE   2460522
     PROGRAM GSI_ANL HAS ENDED.
* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .
*****************RESOURCE STATISTICS*******************************
The total amount of wall time                        = 783.208786
The total amount of time in user mode                = 773.247529
The total amount of time in sys mode                 = 6.648167
The maximum resident set size (KB)                   = 547424
Number of page faults without I/O activity           = 66050
Number of page faults with I/O activity              = 20
Number of times filesystem performed INPUT           = 2279936
Number of times filesystem performed OUTPUT          = 187120
Number of Voluntary Context Switches                 = 8211
Number of InVoluntary Context Switches               = 352
*****************END OF RESOURCE STATISTICS*************************

@RussTreadon-NOAA RussTreadon-NOAA self-requested a review July 31, 2024 00:13
Copy link
Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve.

@RussTreadon-NOAA RussTreadon-NOAA merged commit de74bb2 into NOAA-EMC:develop Jul 31, 2024
4 checks passed
DavidHuber-NOAA added a commit to DavidHuber-NOAA/GSI that referenced this pull request Sep 6, 2024
* origin/develop:
  Move to contrib spack-stack on Jet (NOAA-EMC#787)
  a quick workaround for increasing the mpi task numbers on orion for ctest :: rrfs_3denvar_rdasens  (NOAA-EMC#788)
  Recover the capability of handling model fields from operation gfs.v16.3 (NOAA-EMC#785)
  fix a bug in deter_sfc_gmi (NOAA-EMC#781)
  add safeguard to thompson_reff (NOAA-EMC#779)
  Fix incorrect usage of real(i_kind) in mg_input.f90  (NOAA-EMC#760)
  Transition to Thompson Microphysics for Microwave All-sky Assimilation (NOAA-EMC#743)
  Format changes for EUMETSAT metop-sg and CADS debug fix (NOAA-EMC#773)
  Update global_4denvar and global_enkf ctests to reflect GFS v17 (NOAA-EMC#774)
  fix for cris-fsr memory corruption (NOAA-EMC#767)
  Gnssrwnd1.0 (NOAA-EMC#747)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CADS code seg faults in debug mode format changes for metop-sg-a1 and iasi-ng
6 participants